We present a numerical Monte Carlo analysis of a continuous spin Ising chain that can describe
the statistical properties of folded proteins. We find that, depending on the value of the Metropolis
temperature, the model displays the three known nontrivial phases of polymers: at low temperatures
the model is in a collapsed phase, at medium temperatures it is in a random walk phase, and at
high temperatures it enters the self-avoiding random walk phase. By investigating the temperature
dependence of the specific energy we confirm that the transition between the collapsed phase and
the random walk phase is a phase transition, while the random walk phase and self-avoiding random
walk phase are separated from each other by a cross-over transition. We also compare the predictions
of the model to a phenomenological elastic energy formula proposed by Huang and Lei to describe
folded proteins.



I. INTRODUCTION 

The concept of universality [1], [2] divides critical physical systems into universality classes that differ from each other essentially only by their space-time dimensionality and the symmetry group of their order parameter. This enables the computation of critical properties for an entire class of physical systems using only a single representative model. In the case of polymers one expects that there are three different nontrivial phases, which correspond to the universality class of the self-avoiding random walk (SARW), to the universality class of Brownian motion, i.e. the ordinary random walk (RW), and to the universality class of polymer collapse [3]. Each of these phases is characterized by different values of certain critical exponents that describe the scaling properties of the polymer in the limit where the number N of monomers becomes large. The most widely used critical exponent, the compactness index $\nu$, is the inverse of the Hausdorff dimension of the polymer. It can be introduced by considering how the polymer's radius of gyration $R_g$ increases with the number of monomers, asymptotically for large values of N [4],

Here $\mathbf{r}_i$ ($i = 1, 2, \dots, N$) are the locations of the N monomers in $\mathbb{R}^3$. The critical exponents $\nu$ and $\Delta_1$ are universal quantities. But the form factor $R_0$ that characterizes the effective distance between the monomers in the large N limit, and the amplitude $\beta_1$ that parametrizes the leading finite size corrections, are not. The asymptotic expansion (1) is an example of a general result [5], [6] which states that when N becomes large the mean value of any global observable O of a polymer should behave like

where the exponents are universal, but the pre-factor and the various amplitudes are all non-universal.
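As a minimal numerical sketch of how a compactness index can be extracted in practice, the following Python fragment computes a radius of gyration and fits the leading scaling behavior. It assumes the standard definition of $R_g$ as the root mean square distance of the monomers from their center of mass, and only the leading form $R_g \approx R_0 N^{\nu}$; the function names and the synthetic data are ours and purely illustrative.

```python
import numpy as np


def radius_of_gyration(coords):
    """Root mean square distance of the monomers from their center of mass."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))


def fit_compactness_index(lengths, radii):
    """Least-squares fit of log R_g = log R_0 + nu * log N; returns (nu, R_0)."""
    nu, log_r0 = np.polyfit(np.log(lengths), np.log(radii), 1)
    return float(nu), float(np.exp(log_r0))


if __name__ == "__main__":
    # Synthetic, noise-free data obeying R_g = 2.0 * N**0.5 (assumed values).
    lengths = np.array([50, 100, 200, 400, 800])
    radii = 2.0 * lengths ** 0.5
    print(fit_compactness_index(lengths, radii))
```

For finite chains the sub-leading corrections bias such a two-parameter fit, so the fitted exponent can differ from its asymptotic value.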

For a polymer the compactness index has the following mean field (mf) values [3]:

$$ \nu_{\mathrm{mf}} = \begin{cases} 1/3 & \text{collapsed} \\ 1/2 & \text{RW} \\ 3/5 & \text{SARW} \end{cases} \qquad (3) $$

As a function of temperature, the collapsed phase occurs at low temperatures (bad solvent) while the SARW describes the high temperature (good solvent) behavior of polymers. The random walk phase takes place at the $\theta$-temperature that separates the SARW phase from the collapsed phase. In general the mean field values of the critical exponents acquire corrections due to fluctuations, and for the universality class of the self-avoiding random walk the improved values are $\nu = 0.5880 \pm 0.0015$ and $\Delta_1 = 0.47 \pm 0.03$. These values were obtained in [7] by utilizing the concept of universality that relates the self-avoiding random walk to the $n \to 0$ component $\phi^4$ field theory [8]. The subsequent direct Monte Carlo evaluation reported in [6] gave the very similar values $\nu = 0.5877 \pm 0.0006$ and $\Delta_1 = 0.56 \pm 0.03$, in line with the concept of universality.
Qualitatively, at the level of a mean field theory, the phase structure of a polymer can be described in terms of the Flory-Huggins theory [3]. For this we characterize the polymer concentration by an order parameter $\phi(x)$, with $0 < \phi(x) < 1$. At low concentrations the polymer free energy density (per temperature) has the Landau expansion

$$ \frac{1}{T} E[\phi] = \eta (\nabla\phi)^2 + \gamma \, \phi \ln \phi + \frac{1}{2}(1 - 2\chi)\phi^2 + \frac{g}{6}\phi^3 + \dots \qquad (4) $$

Here $\eta$, $\gamma$, $\chi$ and g are parameters. The first term is a stiffness term. The second term describes entropy contributions. The third term describes monomer-monomer interactions; the (Flory) interaction parameter $\chi$ is generically a decreasing function of temperature. The last term characterizes the three-body (and higher order) monomer interactions. The phase structure can be exposed by ignoring the stiffness term and by minimizing the remaining potential energy contribution to the free energy. With proper relative values of the parameters the potential has a form that is familiar from spontaneous symmetry breaking: when the ground state expectation value $\langle \phi \rangle$ is non-vanishing we are in the collapsed phase, while a vanishing value implies that we are in the universality class of the self-avoiding random walk. The border line that separates these two phases determines the temperature where the polymer is in the universality class of the random walk. It occurs at that value of temperature (or denaturant concentration) for which the excluded volume parameter vanishes, and to first order

$$ 1 - 2\chi(T_\theta) = 0 $$

Thus, for $\chi(T) > 1/2$ we are in the collapsed phase while for $\chi(T) < 1/2$ we enter the SARW phase, and in particular at the $\theta$-point the $\phi^2$ (i.e. mass) contribution to the free energy is absent.
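The symmetry-breaking picture can be illustrated with a small numerical sketch. Purely as simplifying assumptions of ours, the fragment below drops both the stiffness term and the entropy term (i.e. it sets $\gamma = 0$), takes the cubic coefficient to be $g/6$, and minimizes the remaining potential $\frac{1}{2}(1 - 2\chi)\phi^2 + \frac{g}{6}\phi^3$ on a grid; the function name and the parameter values are hypothetical.

```python
import numpy as np


def equilibrium_concentration(chi, g=1.0, n_grid=100001):
    """Concentration in [0, 1] that minimizes the truncated potential.

    Assumptions (ours): the stiffness and entropy terms are dropped and the
    cubic coefficient is g / 6, so V = (1 - 2 chi) phi^2 / 2 + g phi^3 / 6.
    """
    phi = np.linspace(0.0, 1.0, n_grid)
    potential = 0.5 * (1.0 - 2.0 * chi) * phi ** 2 + (g / 6.0) * phi ** 3
    return float(phi[np.argmin(potential)])


if __name__ == "__main__":
    for chi in (0.4, 0.5, 0.6):
        print(chi, equilibrium_concentration(chi))
```

Within this truncation the minimum sits at $\phi = 0$ for $\chi \leq 1/2$ and moves to $\phi = 2(2\chi - 1)/g$ for $\chi > 1/2$, which is the non-vanishing expectation value associated with the collapsed phase.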

Here we shall present results of an extensive numerical analysis of the polymer phase structure. Our approach is based on the chiral homopolymer model introduced in [9]. The applicability of the model to analyze the properties of chiral polymers in all three phases can be justified by the concept of universality. Indeed, the derivation of the model in [9] is very much based on the universality concept: the model accounts for the monomer complexity, the presence of amino acid side chains in proteins, and polymer-solvent interactions in an effective manner. In particular, the model appears to describe certain universal properties of the folded proteins [10] in the Protein Data Bank (PDB) [11] with very high accuracy. More recently, it has also been shown [12], [13] that the model supports dark solitons, and the presence of these solitons appears to be related to the emergence of the collapsed phase. These solitons can also describe folded proteins in PDB with a subatomic accuracy of less than 1 Angstrom in root mean square distance (RMSD). This also motivates us to compare our results to a recently presented phenomenological model of protein folding [14].



II. THE MODEL 

The model introduced in [9] is defined by the following internal energy,

$$ E = \sum_{i,j} a_{ij} \left\{ 1 - \cos[\omega_{ij}(\kappa_i - \kappa_j)] \right\} + \sum_i \left\{ b_i \kappa_i^2 \tau_i^2 + c_i (\kappa_i^2 - m_i^2)^2 + d_i \tau_i \right\} \qquad (5) $$

Here $i, j = 1, \dots, N$ label the N monomers of a (chiral) polygonal chain in $\mathbb{R}^3$. These monomers are located at the vertices of the polygon, and the chain geometry changes when the polymer fluctuates in $\mathbb{R}^3$. The geometry is determined by the order parameter $\kappa_i$ that is a discrete lattice version of the Frenet curvature, and by the order parameter $\tau_i$ that is the lattice version of the Frenet torsion [9]. Once the values of $(\kappa_i, \tau_i)$ for each $i = 1, \dots, N$ are given, the actual shape of the polymer as a polygonal chain in the three dimensional space can be computed by integrating the appropriate discrete version of the Frenet equations. This integration introduces parameters $\delta_i$, the three dimensional distances between the monomers.

The $a_{ij}, \omega_{ij}, b_i, \dots, d_i$ in (5) are parameters. The first sum in the free energy describes long-distance interactions; we have introduced the cosine function to tame excessive fluctuations in $\kappa_i$ in the numerical simulations. In the second sum the first term describes the interaction between $\kappa_i$ and $\tau_i$, and the second term describes the self-interaction of $\kappa_i$. Finally, the last term is a discretized version of the one dimensional Chern-Simons functional; it is the origin of chirality in the polymer chain [9], with handedness that depends on the sign of $d_i$.

For a general polymer the quantities $(a_{ij}, b_i, c_i, m_i, d_i)$ are a priori site-dependent parameters, and different values of these parameters can be used to describe different kinds of monomer (amino acid) structures. For generic $a_{ij}$, (5) is a spin-glass model. Here we shall be interested in the limiting case of a homopolymer where we restrict ourselves to only the nearest neighbor interactions with

$$ a_{ij} = \begin{cases} a \cdot (\delta_{i,j+1} + \delta_{i,j-1}) & (i = 2, \dots, N-1) \\ a & (i = 1, j = 2) \ \& \ (i = N-1, j = N) \end{cases} \qquad (6) $$

and we also select all the remaining parameters to be independent of the site index i. Thus the model in the form studied here reads

$$ E = \sum_{\langle i,j \rangle} a \left\{ 1 - \cos[\omega(\kappa_i - \kappa_j)] \right\} + \sum_i \left\{ b \kappa_i^2 \tau_i^2 + c (\kappa_i^2 - m^2)^2 + d \tau_i \right\} \qquad (7) $$

where the first sum extends over the nearest neighbors. Notice that since the overall scale of the parameters a, b, c and d can be absorbed into the definition of the scale of the Metropolis temperature T, as it stands there are five independent intrinsic parameters. Consequently the scale of energy, say in electronvolts, remains indeterminate and should be defined by (re)normalization at some convenient value of T. We also note that classically, the model (7) has a ground state which is a helix, with $\kappa_i \approx \pm m$.
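To make the structure of the homopolymer energy concrete, here is a minimal Python sketch that evaluates it for given curvature and torsion arrays. It is an illustration under stated assumptions, not the code used for the simulations: each nearest-neighbor bond is counted once, the parameter values are those quoted in Table I with the unlabeled table entry 4.25 assumed to be $\omega$, and the names `PARAMS` and `chain_energy` are ours.

```python
import numpy as np

# Values quoted in Table I; reading the unlabeled entry 4.25 as omega is an assumption.
PARAMS = dict(a=4.0, omega=4.25, b=5.488e-4, c=0.5, m=24.7, d=-20.0)


def chain_energy(kappa, tau, a, omega, b, c, m, d):
    """Energy of a uniform chain with site curvatures kappa and torsions tau.

    Assumption: each nearest-neighbor pair (i, i + 1) contributes once.
    """
    kappa = np.asarray(kappa, dtype=float)
    tau = np.asarray(tau, dtype=float)
    neighbor = a * (1.0 - np.cos(omega * np.diff(kappa))).sum()
    local = (b * kappa ** 2 * tau ** 2
             + c * (kappa ** 2 - m ** 2) ** 2
             + d * tau).sum()
    return float(neighbor + local)


if __name__ == "__main__":
    n = 10
    # Uniform curvature kappa = m with zero torsion: both sums vanish.
    print(chain_energy(np.full(n, PARAMS["m"]), np.zeros(n), **PARAMS))
    # Straight rod kappa = tau = 0: only the quartic term c * m**4 contributes.
    print(chain_energy(np.zeros(n), np.zeros(n), **PARAMS))
```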

We select the numerical values of the parameters in a manner that allows for a direct statistical comparison to PDB data. These values have been found by a trial-and-error comparison with PDB data [9] and they are shown in Table I.

Parameter    Value
a            4
$\omega$     4.25
b            5.488 · 10^-4
c            0.5
m            24.7
d            -20

TABLE I: Parameter values of the model (5) that we use in our simulations.

Furthermore, we shall assume that the distances $\delta_i$ between the monomers, which we need to introduce when we integrate the discrete Frenet equations to construct the polygonal chain in $\mathbb{R}^3$, have the fixed value

$$ |\mathbf{r}_i - \mathbf{r}_{i-1}| = \delta = 3.8 \ \text{Å}, \qquad i = 2, \dots, N. \qquad (8) $$

This value (in Å) is chosen to coincide with the average distance between $C_\alpha$ carbons in the backbone of PDB proteins. Finally, we exclude steric clashes by demanding that the distance between any two monomers satisfies the bound

$$ |\mathbf{r}_i - \mathbf{r}_j| \geq z = 3.7 \ \text{Å} \quad \text{for} \quad |i - j| \geq 2. \qquad (9) $$

Again, this numerical value has been chosen to match the protein data in PDB.
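A steric constraint of this kind can be checked directly on the monomer coordinates. The following Python sketch is a hypothetical helper of ours, assuming the bound applies to all pairs whose chain indices differ by at least two; it tests all such pairs by brute force, which is adequate for chains of the lengths considered here.

```python
import numpy as np


def has_steric_clash(coords, min_distance=3.7):
    """True if any two monomers with |i - j| >= 2 are closer than min_distance."""
    coords = np.asarray(coords, dtype=float)
    # Index pairs (i, j) with j - i >= 2, i.e. excluding chain neighbors.
    i, j = np.triu_indices(len(coords), k=2)
    distances = np.linalg.norm(coords[i] - coords[j], axis=1)
    return bool((distances < min_distance).any())


if __name__ == "__main__":
    # Straight rod with the bond length 3.8 used in the text: no clash.
    rod = [[3.8 * k, 0.0, 0.0] for k in range(6)]
    print(has_steric_clash(rod))
```

In a Metropolis simulation a proposed configuration for which this function returns True would simply be rejected.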

We have used the standard Metropolis algorithm to simulate the model. The initial configuration is a straight rod with $\kappa_i = \tau_i = 0$. Each Monte Carlo step consists of a shift of the curvature and torsion by a typical value of $\Delta\kappa_i = \Delta\tau_i = 0.05$. This shift is accepted with the probability

$$ P = \min\left\{ 1, \exp\left( -\frac{\Delta E}{T} \right) \right\} $$

where $\Delta E$ is the change in the energy caused by the shift and T is the Metropolis temperature. We use this temperature as an external parameter that allows us to probe the different phases of the polymer.
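The acceptance rule of the standard Metropolis algorithm can be sketched in a few lines of Python. This is a generic illustration, assuming the usual form $\min\{1, \exp(-\Delta E / T)\}$ with $\Delta E$ the energy change of the proposed shift; the function names are ours.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """Metropolis probability min(1, exp(-delta_e / temperature))."""
    if delta_e <= 0.0:
        return 1.0  # moves that lower the energy are always accepted
    return math.exp(-delta_e / temperature)


def metropolis_accept(delta_e, temperature, rng=random):
    """Draw a uniform number in [0, 1) and accept the move accordingly."""
    return rng.random() < acceptance_probability(delta_e, temperature)


if __name__ == "__main__":
    rng = random.Random(0)
    accepted = sum(metropolis_accept(1.0, 2.0, rng) for _ in range(10000))
    print(accepted / 10000, acceptance_probability(1.0, 2.0))
```

Treating the case of non-positive energy change separately avoids evaluating a large positive exponent.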

The simulations proceeded as follows: for each temperature value, between 10 and 16 different polymer lengths were selected. The number of Monte Carlo iterations for each chain was chosen to be 11,000 multiplied by the number N of monomers in the polymer. We created around 200 or more polymers for each individual temperature value T and monomer number N, with fewer for the extremely long and the highest temperature curves. The shortest polymers in our simulations had 50 monomers, and the longest ones had 1,800 monomers. These values were chosen to be representative of the single domain proteins in PDB.

Finally, since the free energy (5) is quadratic in $\tau_i$, and furthermore since $\tau_i$ only interacts locally, we can eliminate it by using its equation of motion

$$ \frac{\partial E}{\partial \tau_i} = 0 $$

This gives us

$$ E = \sum_{i,j} a_{ij} \left\{ 1 - \cos[\omega_{ij}(\kappa_i - \kappa_j)] \right\} + \sum_i \left\{ c_i (\kappa_i^2 - m_i^2)^2 - \frac{d_i^2}{4 b_i \kappa_i^2} \right\} \qquad (10) $$

and in the limit of a uniform chain and small $\omega$ we get (after we add boundary contributions and choose $\kappa_0 = \kappa_{N+1} = 0$)

We recognize here a version of the continuous spin Ising chain [15]: indeed, the only difference between (11) and the conventional continuous spin Ising chain is in the presence of the last term in (11). We note that this last term, which has its origin in (10), is quite reminiscent of the potential term that appears in the widely studied Calogero model [16], for the relative coordinate in the two-body case. Furthermore, if we absorb the parameter combination $a\omega^2$ into the definition of the overall scale of temperature, in (10), (11) there are only four independent parameter combinations.
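The elimination of the torsion can be verified numerically. The sketch below considers only the torsion-dependent part of the energy at a single site, $b \kappa^2 \tau^2 + d \tau$, whose stationary point is $\tau = -d / (2 b \kappa^2)$ with value $-d^2 / (4 b \kappa^2)$, the Calogero-type $1/\kappa^2$ term. The function names and the chosen curvature value are ours and illustrative only.

```python
import numpy as np


def torsion_energy(kappa, tau, b, d):
    """Torsion-dependent part of the site energy: b kappa^2 tau^2 + d tau."""
    return b * kappa ** 2 * tau ** 2 + d * tau


def optimal_torsion(kappa, b, d):
    """Solution of d(torsion_energy)/d(tau) = 0."""
    return -d / (2.0 * b * kappa ** 2)


def eliminated_torsion_energy(kappa, b, d):
    """Value of torsion_energy at the optimal torsion: -d^2 / (4 b kappa^2)."""
    return -d ** 2 / (4.0 * b * kappa ** 2)


if __name__ == "__main__":
    kappa, b, d = 1.5, 5.488e-4, -20.0  # b and d as quoted in Table I
    tau_star = optimal_torsion(kappa, b, d)
    print(torsion_energy(kappa, tau_star, b, d),
          eliminated_torsion_energy(kappa, b, d))
```

For positive b the stationary point is a minimum in $\tau$, so any other torsion value gives a larger energy at fixed curvature.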

It has been a commonly held point of view [17] that the lattice version of the $\phi^4$ model is always in the same universality class as the pure Ising model. But this has been disputed in the one dimensional case by explicit computations, e.g. in [18]. Here we have an additional interaction term, the last Calogero-type term in (11), and we shall explicitly show that the ensuing phase structure is highly nontrivial.



III. THE RADIUS OF GYRATION 

We shall first investigate the radius of gyration (1) with the goal to confirm that the model (5) does indeed describe the three different polymer phases characterized by the mean field values (3) of the critical exponent $\nu$.

FIG. 1: The radius of gyration $R_g$ as a function of temperature T and the number of monomers N. The three putative phases are identified with the putative position of the ensuing critical temperatures, denoted by the vertical planes.

In Figure 1 we show how the radius of gyration $R_g$ depends on the (Metropolis) temperature T and the number of monomers N. In Figure 2 we depict the T dependence of $\nu$, and Figure 3 shows the T dependence of the pre-factor $R_0$ in (1).

FIG. 2: The compactness index $\nu$ vs. temperature T. The vertical lines correspond to temperature values where $\nu(T)$ reaches the mean field values (3).



A. Collapsed phase 

In Figure 1 we clearly identify a low-temperature phase which is the putative collapsed phase. In this phase $R_g$ is constant, or has only a very weak T dependence, and we can fit the data with very high accuracy using the following relation,

$$ R_g \approx R_0 N^{\nu}, \qquad (12) $$

where $R_0$ and $\nu$ are the fitting parameters. From the data in Figure 2 we estimate in the low temperature limit

$$ \nu \approx 0.348 \pm 0.007 \qquad (13) $$

This is so close to the mean field value $\nu = 1/3$ of the collapsed phase that we are evidently in that phase.

FIG. 3: The pre-factor $R_0$ in (1) vs. temperature T. The vertical lines correspond to the temperature values where $\nu(T)$ in Figure 2 reaches the mean field values (3).

The parameter $R_0$ that we present in Fig. 3 describes the effective distance between the monomers. In the collapsed phase we estimate in the low temperature limit

$$ R_0 \approx 3.25 \pm 0.15 \ \text{Å} $$

This is clearly smaller than the bare value (8) in our model, suggesting that in the collapsed phase the monomers have the tendency to become more densely packed also along the polymer chain.

From our data we are not able to deduce any non-vanishing value for the sub-leading critical exponent $\Delta_1$ in (1).



When the temperature increases beyond $\log_{10} T \approx 0$, $\nu(T)$ starts increasing and we enter a transition region between the collapsed phase and the putative random walk phase. At the same time as the value of $\nu$ starts increasing, the value of $R_0$ decreases, and when the temperature approaches the value [19]

$$ T_{\mathrm{PDB}} \approx 3.81 \pm 1.52 \qquad (14) $$

we obtain the fit

$$ R_g \approx 2.8 \cdot N^{0.38} \qquad (15) $$

for 50 < N < 1,800, which is very close to the estimate (16) that describes the dependence of the radius of gyration on the number N of $C_\alpha$ carbons for all single strand proteins in PDB with 75 < N < 1,000. This suggests that the model probably gives its best approximation to the PDB data in its collapsed phase, near the transition to the random walk phase. However, we point out that when $T \approx T_{\mathrm{PDB}}$ both $\nu(T)$ and $R_0(T)$ have a quite strong temperature dependence, indicative of the vicinity of a phase transition. This makes the accuracy of our estimates prone to relatively large errors, and more precise estimates require simulations with substantially more computer time.



B. RW and SARW phases 

In Figure 4 we display how the radius of gyration depends on the number of monomers N for a range of values of temperature beyond the collapsed phase, and compare the data with a fit of the form (12). As visible in this figure, even beyond the collapsed phase the data can be fitted with very high accuracy by the relation (12). However, unlike in the collapsed phase where the radius of gyration is practically temperature independent, both in the putative random walk phase and in the putative self-avoiding random walk phase the radius of gyration is a slowly but monotonically increasing function of the temperature.

FIG. 4: The radius of gyration $R_g$ vs. the number of monomers N at different values of temperature T. The error bars are of the same order or smaller than the size of the symbols used. The dashed lines represent the fits by equation (12).



The transition from the collapsed phase to the putative random walk phase is clearly visible in Figures 2 and 3. There is a clear, rapid transition in both $\nu(T)$ and $R_0(T)$, reminiscent of a phase transition. From Figure 2 we estimate that at the transition point $\nu$ is very close to the value

$$ \nu \approx \frac{1}{2} $$

which is the mean field value for the $\theta$-point. For $T > T_\theta$, the compactness index $\nu(T)$ is a slowly increasing function of temperature that eventually plateaus around the value

$$ \nu \approx 0.58 $$

This is slightly above the $\theta$-point value, but slightly below the SARW values reported in [7], [6]. Since the compactness index $\nu(N)$ appears to have a tendency to approach its large-N limit from above [5], we conclude that we are in the RW phase.

For the effective monomer distance $R_0$ we find the value

$$ R_0 \approx 1.67 \pm 0.03 \ \text{Å} $$

which is clearly lower than the bare value (8).

In general one expects that the transition between the collapsed phase and the RW phase is a phase transition, while the transition between the RW and SARW phases is a smooth cross-over [3]. The results in Figures 1-3 are in line with this: the transition between the RW phase and the putative self-avoiding random walk phase is much less dramatic than the transition between the collapsed phase and the RW phase. This also makes the precise identification of the RW and SARW phases more involved.

We find that asymptotically at very high temperatures $\nu$ approaches the value

$$ \nu \xrightarrow{T \to \infty} 0.62 \pm 0.03 $$

This is slightly above the mean field value and the final values obtained in [7], [5], but fully in line with the computations in [5] that revealed that the asymptotic value of $\nu$ is reached from above as the number of monomers increases. We note that here we have restricted ourselves to values of N in the range 50-1,000 that are relevant for single strand proteins, while [6] considered self-avoiding walks with up to 80,000 steps. Consequently finite size corrections have a much stronger effect on our estimates. We also point out that as $T \to \infty$ only the self-avoiding condition (9) persists. Thus, in this limit we must be in the universality class of SARW.

We note that for the effective monomer distance $R_0$ we find in the high temperature limit the value

$$ R_0 = 1.62 \pm 0.08 \ \text{Å} $$

that is, essentially the same as in the RW phase.

In summary, the distinction between the collapsed phase and the RW phase appears very clear in our analysis of the compactness index, and suggests the presence of either a first or a second order phase transition. On the other hand, the transition from the RW phase to the SARW phase is much more difficult to pinpoint, and it appears to proceed much more like a smooth cross-over transition than a phase transition. These observations are fully in line with general expectations [3], and we conclude that the model (5) does indeed correctly describe all the three phases of a polymer.
IV. ELASTIC ENERGY

A. General behavior

For the fixed parameter values that we have given in Table I, the free energy (7) is a function of two extrinsic parameters, the temperature T and the number of monomers N. Its numerical value can be identified as the elastic energy of the polymer chain. In Figure 5 we display a three dimensional plot of (a logarithm of) the specific elastic energy, i.e. the elastic energy per monomer, as a function of these two parameters,

$$ E_{\mathrm{specific}} = \frac{E}{N} \qquad (17) $$

FIG. 5: A three dimensional plot of a logarithm of the specific energy as a function of a logarithm of temperature T and the number of monomers N. The three phases are identified and the putative positions of the ensuing critical temperatures are denoted by the vertical planes.

In this figure we clearly identify the presence of three different phases that are separated from each other by clearly identifiable transition (critical) temperatures $T_{c1}$ and $T_{c2}$ (with $T_{c1} < T_{c2}$). Both the low temperature collapsed phase ($T < T_{c1}$) and the medium temperature RW phase ($T_{c1} < T < T_{c2}$) are characterized by an essentially temperature independent specific energy. Notice that in the collapsed phase the specific energy has a value that is more than one order of magnitude larger than in the RW phase. This is understandable, as it should indeed take much more energy to extend a polymer that is collapsed and resists being extended, than a polymer that behaves like an ideal chain and thus does not really care about its shape. The increase of temperature beyond $T_{c2}$ leads to a transition to the SARW phase, which is characterized by a power-law increase of the specific energy as a function of the temperature: the larger its thermal fluctuations, the more the polymer resists becoming extended. Note also that in the collapsed and RW phases the specific energy in Figure 5 exhibits a weak dependence on the number of monomers N. But in the high temperature SARW phase the specific energy becomes essentially independent of N. This is consistent with the expected behavior of a self-avoiding random walk: it is driven solely by the condition (9), and no reference to the details of the free energy survives the infinite temperature limit. In this limit, the polymer is only subject to random thermal fluctuations.

B. Critical temperature

We have found that the dependence of the specific energy on temperature displayed in Figure 5 can be approximated with very good accuracy by a function

$$ \log_{10}(E/N) = F_{\mathrm{fit}}(\log_{10} T) \qquad (18) $$

that has the following explicit form

$$ F_{\mathrm{fit}}(x) = h_1 + h_2 \arctan[h_3 (x - x_1)] + h_4 \, x \arctan[h_5 (x - x_2)] - h_6 \, x . \qquad (19) $$

The parameters $h_1, \dots, h_6$ and $x_{1,2}$ are determined by fitting to the numerical data at a fixed value of the monomer number N. This explicit form yields an excellent fit whenever there are more than around N = 100 monomers. In Figure 6 we display several examples where we have fitted the functional form (18), (19) to polymers as described by our model, where the values of N are between 200 and 1,000.

FIG. 6: The approximations (dashed lines) of the calculated numerical values (dots) of the specific energy by the function (18), (19) when N = 200, 400, 600, 800, 1000. The specific energy is a monotonically rising function of the monomer number. The lowest and the highest sets correspond to N = 200 and N = 1000, respectively.



The fitted functional form (18), (19) allows us to pinpoint the two critical temperatures $T_{c1}$ and $T_{c2}$. For this we locate the maxima of the squared logarithmic derivative of the specific energy with respect to the logarithm of temperature,

$$ D_E(T, N) = \left( \frac{d \log_{10} E(T, N)}{d \log_{10} T} \right)^2 \qquad (20) $$

This quantity resembles a susceptibility, which is known to have its maxima at the location of the critical temperature(s). The result is shown in Figure 7.
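The peak-finding procedure can be sketched as follows in Python, assuming a fit function built from two arctangent steps as described in the text. All numerical parameter values below are hypothetical and chosen by us only to produce two well separated transitions; they are not the fitted values of the model. The derivative is taken numerically on a grid in $x = \log_{10} T$.

```python
import numpy as np


def f_fit(x, h, x1, x2):
    """h1 + h2 atan[h3 (x - x1)] + h4 x atan[h5 (x - x2)] - h6 x."""
    h1, h2, h3, h4, h5, h6 = h
    return (h1 + h2 * np.arctan(h3 * (x - x1))
            + h4 * x * np.arctan(h5 * (x - x2)) - h6 * x)


def squared_log_derivative(x, h, x1, x2):
    """Square of dF/dx, evaluated by finite differences on the grid x."""
    return np.gradient(f_fit(x, h, x1, x2), x) ** 2


def two_highest_peaks(x, values):
    """Positions of the two largest interior local maxima, in increasing x."""
    idx = np.arange(1, len(x) - 1)
    is_peak = (values[idx] > values[idx - 1]) & (values[idx] >= values[idx + 1])
    peaks = idx[is_peak]
    top = peaks[np.argsort(values[peaks])[-2:]]
    return np.sort(x[top])


# Hypothetical parameters (not fitted values): steps near x1 = 0.53, x2 = 3.52.
H = (0.0, -1.0, 5.0, 0.23, 5.0, -0.23 * np.pi / 2)
X1, X2 = 0.53, 3.52

if __name__ == "__main__":
    grid = np.linspace(-1.0, 5.0, 6001)
    print(two_highest_peaks(grid, squared_log_derivative(grid, H, X1, X2)))
```

With these illustrative parameters the two peaks lie close to the step positions $x_1$ and $x_2$, which is why the maxima of the squared derivative serve as estimates of the transition temperatures.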
FIG. 7: The quantity (20) obtained from the best fits of the functions (18) and (19) for various values of the monomer number N. The vertical red lines correspond to the critical temperatures (21) and (22). The width of the lines gives the uncertainty in the definition of the critical temperatures.

The maxima of (20) appear as peaks that are clearly visible for all values of N that we have studied and displayed in Figure 7. From the results in Table II we estimate that the critical temperatures have the following values,

$$ \log_{10} T_{c1} = 0.53 \pm 0.02, \quad \text{or} \quad T_{c1} = 3.38 \pm 0.15 \qquad (21) $$

$$ \log_{10} T_{c2} = 3.52 \pm 0.09, \quad \text{or} \quad T_{c2} = 3306 \pm 716 \qquad (22) $$


Notice that the position of the first maximum is practically the same for all values of N, but the larger the value of N, the higher the maximum. This indicates that the transition between the collapsed phase and the RW phase at $T = T_{c1}$ is indeed a phase transition, which is either of the second order or of the first order; our analysis is not sufficient to determine the order of this transition.

On the other hand, the transition between the RW and SARW phases at $T = T_{c2}$ is likely to be a smooth crossover transition, since now neither the position of the maximum nor its height shows any similarly clear localized profile with increasing values of the monomer number N.



V. THE PHASE DEPENDENCE OF THE FREE ENERGY

We have found that in each of the three phases the elastic energy computed from (7) has its distinct, universal dependence on the monomer number N or, alternatively, on the radius of gyration $R_g$.

N       log10 T_c1(L)   log10 T_c2(L)
200     0.5023          3.365
300     0.5114          3.397
400     0.5229          3.450
500     0.5402          3.599
600     0.5379          3.570
700     0.5671          3.552
800     0.5184          3.563
900     0.5360          3.534
1000    0.5254          3.638
Avr.    0.53(2)         3.52(9)

TABLE II: The critical temperatures $T_{c1}$ and $T_{c2}$, determined for each fixed number of monomers N. The averaged value is shown in the last row along with the respective errors.
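The averages in the last row of Table II can be reproduced from the tabulated values. The short Python check below assumes that the quoted errors are sample standard deviations over the nine chain lengths; that interpretation is ours, although it does reproduce the quoted digits.

```python
import numpy as np

# log10 of the critical temperatures for N = 200, 300, ..., 1000 (Table II).
LOG_TC1 = np.array([0.5023, 0.5114, 0.5229, 0.5402, 0.5379,
                    0.5671, 0.5184, 0.5360, 0.5254])
LOG_TC2 = np.array([3.365, 3.397, 3.450, 3.599, 3.570,
                    3.552, 3.563, 3.534, 3.638])


def mean_and_error(values):
    """Mean and sample standard deviation (ddof=1) of the tabulated values."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1))


if __name__ == "__main__":
    for label, data in (("log10 T_c1", LOG_TC1), ("log10 T_c2", LOG_TC2)):
        mean, err = mean_and_error(data)
        print(f"{label}: {mean:.2f} +/- {err:.2f}, T = {10 ** mean:.2f}")
```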



A. Collapsed phase 

In the collapsed phase $T < T_{c1}$ the dependence of the free energy on the number of monomers can be described by the following temperature independent, logarithmically corrected linear law:

$$ E(N)/E_0 = C_{\mathrm{Coll}} \, N \ln\left( \frac{N}{N_0^{\mathrm{Coll}}} \right) \qquad (23) $$

Here $E_0$ is a parameter that defines the scale of the energy (say, in electronvolts) and must be obtained by an independent measurement. We find the presence of the logarithmic correction to scaling, as opposed to the analytic corrections suggested by (2), to be quite notable: we have made a very detailed analysis of the functional form (23), and the logarithmic correction to scaling consistently exceeds the accuracy of any power-law alternative.

The parameters $N_0^{\mathrm{Coll}}$ and $C_{\mathrm{Coll}}$ can be calculated using a fitting procedure. The results are shown, respectively, in the upper and lower plots of Figure 8. The parameter $N_0^{\mathrm{Coll}}$ is essentially temperature independent in the low-temperature regime, with value

$$ N_0^{\mathrm{Coll}} \approx 22 $$

In terms of the radius of gyration we get from (12), (13) the approximate expression (per units of energy)

$$ E(R_g) \approx 971.0 \cdot R_g^{2.86} \ln\left( \frac{R_g}{9.53} \right) \qquad (24) $$

The relevant aspect of (24) is its dependence on $R_g$. Since the radius of gyration scales in proportion to the end-to-end distance, the result (24) means there is a very rapidly growing elastic force between the end points of the collapsed polymer in our model; in particular, the elastic force grows clearly more rapidly than in Hooke's law.
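The passage from the monomer number to the radius of gyration can be checked with a few lines of Python. Substituting $N = (R_g / R_0)^{1/\nu}$ into $N \ln(N / N_0)$ gives $\frac{1}{\nu} R_0^{-1/\nu} R_g^{1/\nu} \ln[R_g / (R_0 N_0^{\nu})]$. The sketch below evaluates the exponent and the length scale inside the logarithm for the collapsed-phase estimates quoted in the text; the function name is ours.

```python
import math


def radius_form_of_log_law(r0, nu, n0):
    """Rewrite N ln(N / n0) with N = (R / r0)**(1 / nu) in terms of R.

    Returns (exponent, length_scale, prefactor) such that
    N ln(N / n0) = prefactor * R**exponent * ln(R / length_scale).
    """
    exponent = 1.0 / nu
    length_scale = r0 * n0 ** nu
    prefactor = exponent / r0 ** exponent
    return exponent, length_scale, prefactor


if __name__ == "__main__":
    # Collapsed-phase estimates from the text: R_0 = 3.25, nu = 0.348, N_0 = 22.
    exponent, length_scale, prefactor = radius_form_of_log_law(3.25, 0.348, 22.0)
    print(round(exponent, 2), round(length_scale, 2))
```

The length scale evaluates to about 9.53, in agreement with the logarithm in the text, while the exponent comes out near 2.87; the quoted value 2.86 corresponds to rounding the compactness index to 0.35.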
FIG. 8: The parameters of the fit (23): $N_0^{\mathrm{Coll}}$ (the upper plot) and $C_{\mathrm{Coll}}$ (the lower plot).

Notice that according to the value of the critical temperature (21), the last data point in Fig. 8 (the one with the highest temperature value) is in the RW phase. However, we have found that the two parameter fit (23) can still be applied to successfully describe this point.



B. RW phase 
In the RW phase we have found that the energy obeys the following scaling law (per units of energy)

$$ E(T, N) = C_{\mathrm{RW}}(T) \, N \left[ 1 - \left( \frac{N}{N_0^{\mathrm{RW}}(T)} \right)^{-\gamma(T)} \right] \qquad (25) $$

This is an example of the general form (2). The best fits of the parameters $\gamma$, $N_0^{\mathrm{RW}}$ and $C_{\mathrm{RW}}$ are shown in Figure 9 as functions of temperature. We find that all of these parameters are essentially temperature independent with the following average central values,

$$ \gamma = 0.355(33), \qquad N_0^{\mathrm{RW}} = 8.3(1.5), \qquad C_{\mathrm{RW}} = 1098(22) \qquad (26) $$

These values are shown as horizontal lines in Figure 9.

FIG. 9: The best fit parameters $\gamma$, $N_0^{\mathrm{RW}}$ and $C_{\mathrm{RW}}$ of the function (25). The horizontal lines mark the central values (26), and the width of the lines describes the corresponding errors.

If we use the approximation that $\nu \approx 1/2$ in the RW phase, (25) gives us Hooke's law with a (temperature dependent) correction term (per units of energy),

$$ E(R_g, T) \approx C_{\mathrm{RW}} \left( \frac{R_g}{R_0} \right)^2 \left[ 1 - (N_0^{\mathrm{RW}})^{\gamma} \left( \frac{R_g}{R_0} \right)^{-2\gamma} \right] \qquad (27) $$
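The equivalence of the two forms of the RW scaling law under the substitution $R_g = R_0 N^{1/2}$ can be verified numerically. The Python sketch below assumes the form $C_{\mathrm{RW}} N [1 - (N / N_0^{\mathrm{RW}})^{-\gamma}]$ for the energy, with the central parameter values quoted in the text as defaults; the function names and the sample chain length are ours.

```python
import math


def rw_energy_from_n(n, c_rw=1098.0, n0=8.3, gamma=0.355):
    """E = C_RW * N * [1 - (N / N_0)**(-gamma)] in units of the energy scale."""
    return c_rw * n * (1.0 - (n / n0) ** (-gamma))


def rw_energy_from_radius(r_g, r0, c_rw=1098.0, n0=8.3, gamma=0.355):
    """Same law written in terms of R_g, assuming R_g = r0 * N**0.5."""
    x = r_g / r0
    return c_rw * x ** 2 * (1.0 - n0 ** gamma * x ** (-2.0 * gamma))


if __name__ == "__main__":
    n, r0 = 400, 1.67  # illustrative chain length; R_0 estimate for the RW phase
    r_g = r0 * math.sqrt(n)
    print(rw_energy_from_n(n), rw_energy_from_radius(r_g, r0))
```

The leading term is quadratic in $R_g$, which is the Hooke's law behavior, while the bracket supplies a correction that decays as a power of $R_g$.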
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C. SARW phase 

In the SARW phase we conclude that the energy is 
a linear function of the monomer number (per units of 
energy), 



E(N,T) = Csarw(T)N. 



(28) 



and with the mean field value of the compactness index 
i/ = 3/5 we get in terms of radius of gyration (per units 
of energy) 



E(R g ,T) « Csarw(T) ( 5£ 



5/3 



(29) 



From our data we are not able to observe any of the 
correction terms in The only fitting parameter, 

Csarw(T), is shown in Fig. (10 1 as a function of tem- 
perature. 




FIG. 10: The coefficient of the linear law (28) as a function
of temperature. The dashed line illustrates the best fit (30)
with the parameters (31).

We also find that the temperature dependence of the
coefficient $C_{\mathrm{SARW}}$ can be described by a power law

$$C_{\mathrm{SARW}}(T) = C_0 \, T^{\alpha} , \qquad (30)$$

where the prefactor $C_0$ and the exponent $\alpha$ are

$$C_0 = 12(4) , \qquad \alpha = 0.72(6) . \qquad (31)$$

Note that according to the value of the critical temperature
(22), in Figure 10 the first two points that have the
lowest temperature values belong to the RW phase but
they can still be described with the present fit. In fact,
the N dependence of the free energy at these two temperature
values can be fitted both by the linear law (28)
and by the more general power law (25). However, the
power-law fit will lead to very large error bars for the
best fit parameters, and therefore we have not shown
these points in Fig. 9. Moreover, since we expect that
the transition between the RW and SARW phases is a
crossover, there should be no clear distinction between
these phases in the vicinity of the transition region.

The logarithmic (23), power (25) and linear (28) fits
are all shown in figure 11.

FIG. 11: The logarithmic (23), power (25) and linear (28) fits at
various temperatures.

Finally, we summarize the results in Fig. 12 where we
show how the specific elastic energy (17) depends on the
radius of gyration $R_g$ for various temperatures. The upper
plot of Fig. 12 corresponds to the collapsed and RW
phases. It is clearly visible that both in the collapsed phase
and the RW phase the relation $E_{\mathrm{specific}} = E_{\mathrm{specific}}(R_g)$
is indeed universal: there is no observable temperature
dependence. We also note the rapid change from collapsed
phase to RW phase. The lower plot of Fig. 12
describes the high-temperature SARW phase. While an
increasing function of temperature, the energy now has
only very weak (if any) dependence on the radius of gyration.
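As an illustration of the power law (30) for the SARW coefficient, the following minimal Python sketch evaluates $C_{\mathrm{SARW}}(T) = C_0 T^{\alpha}$ with the central values quoted in (31) and combines it with the linear law (28). The function names, argument names and the sample temperatures are hypothetical and chosen only for this example; the quoted uncertainties of the parameters are not propagated.

```python
# Minimal sketch of the power law (30), C_SARW(T) = C_0 * T**alpha.
# Default values are the central values quoted in (31); all names here
# are hypothetical and chosen only for this illustration.


def c_sarw(temperature, c0=12.0, alpha=0.72):
    """Return the SARW-phase coefficient C_SARW(T) from the power law (30)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return c0 * temperature ** alpha


def energy_sarw(n_monomers, temperature, c0=12.0, alpha=0.72):
    """Return E(N, T) = C_SARW(T) * N, i.e. the linear law (28)."""
    return c_sarw(temperature, c0, alpha) * n_monomers


if __name__ == "__main__":
    # Sample temperatures are arbitrary illustrative inputs.
    for t in (10.0, 100.0, 1000.0):
        print(t, c_sarw(t), energy_sarw(100, t))
```

Because the energy is linear in N, the specific energy E/N computed this way depends on the temperature only, which is the behavior described for the SARW phase.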



VI. PROTEINS AND THE HUANG-LEI 
ELASTIC ENERGY 

In [13] the authors propose that the elastic energy of
folded proteins in PDB can be described by the following
phenomenological (Huang-Lei) formula (per units of
energy)



E HL (R g ,N) = aN 4 ^ + b(NR g ) 1 / 2 + c 



N 2 



< (32) 

Here a, b and c are fitting parameters. By minimizing
the energy, the authors [13] compute for the compactness
index the value

$$\nu_{\mathrm{HL}} = 3/7 . \qquad (33)$$

A priori this suggests [14] that folded proteins could be
in a universality class which is different from the known
ones ([3]).
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FIG. 12: (Logarithm of) Specific elastic energy E/N vs. the 
radius of gyration $R_g$ at different temperatures. The distinct
points in the same series correspond to different numbers
of monomers N. Upper plot: the low-temperature collapsed
phase and medium-temperature RW phase including
the transition region between them. Lower plot: the
high-temperature SARW phase. The results are fully in line with
the analytic expressions (24), (27) and (29), respectively.



In this Section we shall analyze the formula (32) in the
context of our model. We find that it gives an accurate
description of data in our model, in particular around the
transition point between the collapsed phase and the RW
phase where the compactness index grows continuously
and monotonically from around $\nu \approx 1/3$ to around
$\nu \approx 1/2$ over a finite temperature interval, due to finite-size scaling
effects that are characteristic of a finite length chain. The
value (33) corresponds to the temperature value

$$T_{\mathrm{HL}} = 12.5 \pm 1.7$$

in our model, which suggests that we are (slightly) above
the transition temperature $T_{c1}$ between collapsed and
RW phases.

In figure 13 we show examples where we have fitted
(32) to elastic energy computed from our model for three
different values of the temperature: deep in the collapsed
phase and in the vicinity of the critical temperature $T_{c1}$
that separates the collapsed phase from the RW phase
in our model. The width of the best-fit lines describes
the uncertainty in the best-fit parameters, reflecting the
statistical errors in our data.
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FIG. 13: Three examples of the fits of the elastic energy E 
by the Huang-Lei formula (32). The T = 0.0001 line is deep
in the collapsed phase while the T = 1 and T = 3 lines are
both in the transition region from collapsed to RW phase, in
the vicinity of the critical value $T_{c1} \approx 3.38$.

We have found that deep in the collapsed phase the fit
is not very good and consequently (32) does not describe
fully collapsed proteins, as expected from the value of
the compactness index. But when we enter the transition
region between collapsed phase and RW phase and
the compactness index starts increasing (continuously as
a function of temperature for finite length chains), the
quality of the fit improves steadily, and in
the vicinity of the critical temperature $T_{c1}$ we find for
the statistical $\chi$-square parameter per degree of freedom
(dof) a value around

$$\chi^2/(\mathrm{dof}) \approx 1 .$$
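For readers who want to reproduce this kind of goodness-of-fit check, a minimal Python sketch of the $\chi^2$ per degree of freedom is given below. It assumes the standard definition, namely the sum of squared, error-normalized residuals divided by the number of data points minus the number of fitted parameters (three for a formula with parameters a, b and c). The function name and the sample numbers are hypothetical and are not data from our simulations.

```python
# Minimal sketch of the chi-square per degree of freedom, assuming the
# standard definition. The function name and sample data are hypothetical.


def reduced_chi_square(observed, predicted, sigma, n_params):
    """Return chi^2/(dof) for a fit with n_params free parameters.

    observed, predicted and sigma are equal-length sequences holding the
    measured values, the fitted values and the statistical errors.
    """
    if not (len(observed) == len(predicted) == len(sigma)):
        raise ValueError("input sequences must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fit parameters")
    chi2 = sum(((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigma))
    return chi2 / dof


if __name__ == "__main__":
    # Five illustrative points, each one standard error away from the fit.
    obs = [1.1, 1.9, 3.1, 3.9, 5.1]
    fit = [1.0, 2.0, 3.0, 4.0, 5.0]
    err = [0.1] * 5
    print(reduced_chi_square(obs, fit, err, n_params=3))
```

Values near unity indicate that the residuals are comparable to the statistical errors, while values much larger than unity signal that the fitted formula does not describe the data.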

In figure 14 we summarize our findings for the set of
best fit parameters for (32). The red-colored zones correspond
to those values of temperature where the $\chi^2/(\mathrm{dof})$
parameter is very large, typically taking values around 10
and higher. In the uncolored (white) zones the $\chi^2/(\mathrm{dof})$
parameter has values that are in the vicinity of unity.

In figure 15 we show the behavior of the parameters a,
b and c in the region where the $\chi^2/(\mathrm{dof})$ values are in
the vicinity of unity, that is, near the transition between
collapsed phase and RW phase, and within the RW phase.
We have found that the temperature dependence of the
parameters a and b can be fitted by linear functions:



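$$a(T) = C_a \left( 1 + \frac{T}{T_a} \right) , \qquad (34)$$

$$b(T) = C_b \left( 1 + \frac{T}{T_b} \right) , \qquad (35)$$

where

$$C_a = 8.6(1.4) \cdot 10^{3} , \qquad T_a = 216(37) , \qquad (36)$$

$$C_b = -1.4(1) \cdot 10^{4} , \qquad T_b = 246(44) . \qquad (37)$$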
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FIG. 14: The best fit parameters of the fits of the elastic 



energy (32) are shown on a log-log scale. The description is
given in the text.

FIG. 15: The parameters of the fit (32) in the RW region.
The dashed lines are the best fits given by Eqs. (35),
(36), and (37).



These fits are shown in Fig. 15 by the dashed lines.

Our conclusion is that the Huang-Lei formula (32)
gives a very good description of the elastic energy in our
model, in particular when we are very near the transition
point between the collapsed and RW phases, and slightly
inside the RW phase. But it is not very accurate for temperature
values that are deep in the collapsed phase, nor
when we approach the cross-over to the SARW phase.
We note that this is consistent with the behavior of the
compactness index in our model as displayed in Fig. 2.
When we compare the computed value (33) with Fig. 2
we find that this value corresponds to the transition region.

Together with [14], and the comparison between (15)
and (16), and (14) and (21), these results suggest that
our model should describe the statistical properties of
folded proteins in PDB, for temperature values that are
very close to the critical value $T_{c1} \approx 3.38$.



VII. DISCUSSION 

We have investigated the statistical properties of a ho- 
mopolymer model that has been introduced to describe 
the properties of collapsed proteins in Protein Data Bank. 
We have found that as a function of temperature the 
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model does indeed realize the three known phases of poly- 
mers: the collapsed phase, the random walk phase (RW), 
and the self-avoiding random walk phase (SARW). Fur- 
thermore, we have found that the model predicts that 
the transition between the collapsed phase and the ran- 
dom walk phase is a phase transition, while the random 
walk and self-avoiding random walk phases are separated 
from each other by a smooth cross-over transition. These 
findings are in line with general arguments on the phase 
structure of polymers [3] . 

We have also computed the elastic energy as a func- 
tion of radius of gyration i.e. end-to-end distance of a 
polymer. In the collapsed phase we have found that the 
energy grows faster than in Hooke's law, in the RW phase 
we find Hooke's law with temperature-dependent corrections,
and finally in the SARW phase we find that the dependence
of the energy on the radius of gyration is weaker
than in Hooke's law. It would be interesting to test our 
predictions experimentally in the case of proteins, for ex- 
ample using atomic force microscopy. 



Finally, we have compared our model with a phe- 
nomenological expression that has been introduced by 
Huang and Lei to describe the elastic energy of collapsed 
proteins. We have found that the Huang-Lei formula 
gives a good effective description of our model, in particular
when we are in the vicinity of the transition region
that separates the collapsed phase from the random
walk phase. This is also consistent with our evaluation 
of the temperature dependence of the compactness in- 
dex. When compared with the PDB data this suggests 
that statistical properties of collapsed proteins are in- 
deed described by the present model in the vicinity of 
this transition point. 
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